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ABSTRACT 



X 



We have developed a new three-dimensional algorithm, based on the standard 
P 3 M method, for computing deflections due to weak gravitational lensing. We com- 
pare the results of this method with those of the two-dimensional planar approach, 
and rigorously outline the conditions under which the two approaches are equivalent. 
Our new algorithm uses a Fast Fourier Transform convolution method for speed, and 
has a variable softening feature to provide a realistic interpretation of the large-scale 
structure in a simulation. The output values of the code are compared with those 
from the Ewald summation method, which we describe and develop in detail. With an 
optimal choice of the high frequency filtering in the Fourier convolution, the maximum 
errors, when using only a single particle, are about 7 per cent, with an rms error less 
than 2 per cent. For ensembles of particles, used in typical iV-body simulations, the 
rms errors are typically 0.3 per cent. We describe how the output from the algorithm 
can be used to generate distributions of magnification, source ellipticity, shear and 
convergence for large-scale structure. 

Key words: Galaxies: clustering — Cosmology: miscellaneous — Cosmology: grav- 
itational lensing — Methods: numerical — Large-scale structure of Universe 



; 1 INTRODUCTION 



Procedures for the generation of cosmological iV-body sim- 
ulations have become increasingly sophisticated in recent 
years. We have now developed an algorithm for assessing 
in great detail, and with considerable speed and accuracy, 
the effects of the large-scale mass distributions within these 
simulations on the passage of light from sources at great 
distances. 



1.1 Previous work 

There are numerous methods for studying such 'weak' grav- 
itational lensing, the most common being 'ray-tracing,' in 
which individual light rays are traced backwards from the 
observer, and the deflections occuring at each lens-plane are 
calculated. The lens-planes are two-dimensional projections 
of the mass content within a small redshift interval, usually 
equal to the simulation box depth. 

Schneider and Weiss (1988b) have used this method by 
shooting, typically, 10 8 rays through the lens-planes to strike 
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the source plane within a chosen square region, called the de- 
tection field. Each plane is divided into 51 2 pixels, typically, 
and the particles (stars in this case) are categorised as near 
or far for computational purposes. The deflection by stars 
further away than about 8 pixel dimensions from the centre 
of each pixel is approximated in a Taylor series, whilst the 
deflections due to the nearby stars are computed individu- 
ally. The rays are shot through a cylinder with a radius such 
that the rays meet the source plane in or near the detection 
field. They claim that the resulting amplification factors are 
hardly affected by edge effects caused by strong lensing of 
rays outside the shooting cylinder, but which might strike 
the detection field. The amplification factors are determined 
directly from the mapping of the rays onto the pixels of the 
detection field. 

Jaroszynski et al. (1990) evaluate the matter column 
density in a matrix of pixels for each of the lens-planes, based 
on their simulation boxes. By making use of the periodicity 
in the particle distribution orthogonal to the line of sight, 
they are able to arrange for each ray traced to be centralised 
within planes of one full period in extent. In this way, the 
deflections on each ray take account of all the matter within 
one complete period. They calculate, not deflection angles, 
but the two two-dimensional components of the shear, (see 
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Section 5.3), as ratios of the mean convergence of the beam. 
To do this, they assume that the matter in each of the 128 3 
pixels resides at the centre point of each pixel. To follow the 
shearing across subsequent planes they recursively generate 
the developing Jacobian matrix for each ray, in accordance 
with the multiple lens-plane theory, (see Section 5.3). 

Wambsganss (1990) uses the 'ray-tracing' method to 
study microlensing, Wambsganss, Cen and Ostriker (1998) 
use it with cosmological iV-body simulations, and Wambs- 
ganss et al. (1997) use the method for studying the disper- 
sion in type la supernovae. They randomly orient each sim- 
ulation box, and project the matter contained within each 
onto a plane divided into pixels. They choose the central 
8ft _1 Mpc x 8h _1 Mpc region through which to shoot rays, 
(h is the Hubble constant in units of 100 km s" 1 Mpc -1 ), 
but account for the deflections of the rays in terms of all the 
matter in the plane of 80fo _1 Mpc x 80/i _1 Mpc. However, to 
speed up the computation, a hierarchical tree code in two di- 
mensions is used to collect together those lenses (pixels) far 
away, whilst treating nearby lenses individually. The mat- 
ter in each pixel, which measures 10ft _1 kpc x 10h~ kpc, 
is assumed to be uniformly spread. The multiple lens-plane 
theory, (see Section 5.3), is used for large numbers of rays to 
compute the mappings of images and sources, the distribu- 
tion of magnifications, and statistics of angular separations 
of multiple images. 

Marri and Ferrara (1998) select a total of 50 planes, 
evenly spaced in redshift space, between redshifts of z = 
and 2 = 10. Their two-dimensional matter distribution 
on each plane consists of point masses without softening, 
so that their approach produces very high magnifications 
(greater than 20) for each of their chosen cosmologies, using 
the 'ray-tracing' method. 

Premadi, Martel and Matzner (1998) have used 5 differ- 
ent sets of initial conditions for each iV-body cosmological 
simulation, so that the plane projections of each simulation 
box can be chosen randomly from any one of the 5, and then 
randomly translated, using the periodic properties of each 
box. In this way they are able to avoid correlations in the 
large-scale structure between adjacent boxes. A consider- 
able improvement to the 'ray-tracing' method has then been 
made. They solve the two-dimensional Poisson equation on a 
grid, and invert the equation using a two-dimensional Fast 
Fourier Transform (FFT) method, to obtain the first and 
second derivatives of the gravitational potential on each 
plane. From this data, cumulative deflections and the de- 
veloping Jacobian matrix can be obtained, which provides 
the data for determining overall magnifications. 

An alternative to the conventional form of 'ray-tracing' 
was introduced by Refsdal (1970), who used the 'ray-bundle' 
method. The principle here is to trace the passage of a dis- 
crete bundle of light rays as it passes through the deflection 
planes. The advantage of this method is that it provides a 
direct comparison between the shape and size of the bundle 
at the observer and at the source plane, so that the magni- 
fication, ellipticity and rotation can be determined straight- 
forwardly. 

Fluke, Webster and Mortlock (1998a, b) use the 'ray- 
bundle' method, and have found that bundles of 8 rays plus 
a central ray adequately represent a circular image which is 
projected backwards. They shoot large numbers of bundles 
through two-dimensional planes, as above, formed by pro- 



jecting their simulation cubes, which have comoving sides of 
60/i _1 Mpc. The shooting area is limited to 50" x 50", and 
the deflection angles are calculated by considering the mat- 
ter contained within a chosen radius, typically 15ft _1 Mpc 
from the central ray. 

A novel approach to weak gravitational lensing has been 
used by Holz and Wald (1997). They lay down a set of 
spheres between the observer and source, each containing 
an individual probability distribution of matter, in which a 
'Newtonianly perturbed' Robertson- Walker metric is used. 
A scalar potential, related to the density perturbations, can 
then be evaluated, and this allows integration along straight 
lines through each sphere, to determine the angular devia- 
tions and shear. 

Tomita (1998a) evaluates the potential at some 3000 po- 
sitions between the observer and a source at z = 5, by using 
the periodic properties of each simulation cube to position 
it such that each evaluation position is centrally placed in 
the appropriate cube. To trace the paths of the light rays, 
they solve the null-geodesic equations, and use an analytical 
expression to determine the average potential through the 
interval between each pair of evaluation positions. 



1.2 Motivation 

Our motivation for the development of an algorithm to apply 
in three-dimensions has stemmed from concern with possi- 
ble limitations in the two-dimensional planar approaches to 
weak gravitational lensing. We therefore considered the fol- 
lowing. 

First, we wanted to investigate rigorously the conditions 
for equivalence of results obtained from three-dimensional 
realisations and two-dimensional planar projections. We 
show in Appendix B, that the shear values^ at a point in 
the two-dimensional projection are equal to the integrated 
three-dimensional values along the line of sight through one 
period (or cube dimension), in general, only if the distri- 
bution of mass is periodic along the line of sight, and the 
angular diameter distances through the depth of one period 
are assumed to be constant. 

Second, we wanted a method which would unambigu- 
ously provide accurate values for the shear components, as 
if the contribution from all matter, effectively stretching 
to infinity, was included. Errors may occur in other meth- 
ods if the contribution from matter within only a finite 
radius of the evaluation position is counted. For example, 
whilst Jaroszyhski at al. (1990) include the matter contained 
within a plane of one complete period, Wambsganss (1990), 
Wambsganss et al. (1997) and Wambsganss et al. (1998) 
introduce a slight bias, because rays near the edge of their 
shooting area are closer to the edge of the single period plane 
than rays near the centre of the shooting area. Fluke et al. 
(1998a, b) typically consider a region of radius 15/i -1 Mpc 
in simulations with a period of 60/i _1 Mpc. We investigate 
how quickly the shear component values converge to their 



t Note that, throughout this paper, we refer to the elements of 
the matrix of second derivatives of the gravitational potential as 
the 'shear' components, although, strictly, the term 'shear' refers 
to combinations of these elements which give rise to anisotropy. 
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true values, as increasing volumes of matter are included sur- 
rounding the evaluation position, and we report our findings 
in Section 4. 

Third, to achieve shear values consistent with those in 
a realistic universe, it is necessary to deal with the 'peculiar 
potential,' <f>, which is related to the full gravitational poten- 
tial, $, through the subtraction of a term depending upon 
the mean density. This approach, which we describe fully in 
Appendix A, ensures that we deal only with light ray deflec- 
tions arising from departures from homogeneity; in a pure 
Robertson- Walker metric we would want no deflections. The 
approach is equivalent to requiring that the net total mass 
in the system be set to zero, so that for every mass there 
is a balancing uniform negative background mass. This net 
zero mass requirement is achieved very simply in systems 
of particles which are periodically distributed in all dimen- 
sions, and we are therefore able to accommodate it easily in 
our algorithm. 

Fourth, we wanted to be able to apply our algorithm to 
cosmological simulations which were representative of the 
large-scale structure in the universe, rather than distribu- 
tions of point masses. This has frequently been attempted 
by assuming each particle to be softened with the profile 
of an isothermal sphere. We wanted to improve on this by 
allowing the softening for each particle to reflect the envi- 
ronment in which it is located, so that large clusters and 
other dense structures dominate the light deflections, whilst 
the effects of isolated particles are minimised. This moti- 
vated us to consider the introduction of a variable softening 
facility to our algorithm. 

Finally, a three-dimensional approach allows the use of 
the appropriate angular diameter distances at every single 
evaluation position within the three-dimensional realisation. 
The two-dimensional methods discussed above necessarily 
assume that all the lensing mass within a given plane is 
at the same angular diameter distance, because the overall 
depth of the lens is considered to be small compared with 
the observer-lens, observer-source, and source-lens angular 
diameter distances. However, the depth of a single simula- 
tion cube employed in weak lensing, (in our case 100/i _1 
Mpc), is not insignificant. By using the 'thin plane approxi- 
mation,' in which matter in a small redshift interval is pro- 
jected onto a plane, we are able to show that errors may be 
introduced. It is not possible to quantify these errors in gen- 
eral, because they will vary from simulation to simulation, 
depending on the specific mass distributions. However, we 
show in Section 4.2 that the scaling factors for the computed 
shear components can vary by as much as 9% through the 
depth of those simulation boxes contributing the most to 
the magnification. 

The considerations above finally led us to develop an 
efficient Fast Fourier Transform (FFT) programme, whose 
data output could be manipulated with the appropriate 
angular diameter distances at every evaluation point. The 
primary output of the programme is the matrix of second 
derivatives of the gravitational potential at each of the speci- 
fied evaluation positions within a periodic three-dimensional 
distribution of smoothed particle masses. 



1.3 Outline of paper 

In section 2, we describe the general principles employed in 
the standard Particle- Particle, Particle-Mesh (P 3 M) algo- 
rithm, and then how it has been extended for the evaluation 
of the shear components. We explain the introduction of 
variable softening into the code, which allows each parti- 
cle to be represented as a distributed mass. This variable 
softening smooths away the high frequency information in 
very high density clumps, thus avoiding strong scattering, 
and also allows particles in low density regions to be spread 
more widely to give more realistic density values. 

In section 3, we describe our testing procedures for the 
code. The first of these compares the computed shear com- 
ponents at a large number of points surrounding a single 
massive particle, with values derived from the Ewald (1921) 
summation technique. The second test compares values of 
the normalised trace of the shear matrix with density values 
derived using an independent method, a smoothed particle 
hydrodynamics, SPH, algorithm. 

Section 4 emphasises two advantages of our new algo- 
rithm. First, we demonstrate the slow convergence of shear 
values to their true values, as increasingly large volumes 
of matter are included surrounding an evaluation position. 
This suggests the need, in general, to include the effects of 
matter well beyond a single period transverse to the line of 
sight. Secondly, we show that, by considering all the matter 
in a cubic simulation to be at the same angular diameter 
distance, sizeable errors may be introduced to the absolute 
shear values, and to calculations of the magnification along 
a line of sight. 

Section 5 describes the cosmological iV-body simula- 
tions we are using for the application of the new algo- 
rithm, and explains our choice of the appropriate minimum 
value for the variable softening. The multiple lens-plane the- 
ory (described by Schneider, Ehlers & Falco, 1992) is sum- 
marised for the determination of magnification distributions 
along large numbers of lines of sight through the simula- 
tions. We then provide some preliminary results to show the 
efficacy of the method. These include (a) plots of the mag- 
nification as it develops along a line of sight, (b) values for 
the shear and convergence in a given cosmological simula- 
tion, and (c) distributions of the magnification due to weak 
lensing for isolated simulation boxes. Finally, we outline our 
proposed future work, which will link simulations together 
to provide a complete realisation from a distant source to an 
observer in the present epoch. This will enable us to compare 
the results from different cosmologies. 

Section 6 summarises our conclusions about the algo- 
rithm and its applicability. 

In Appendix A, we describe how the peculiar potential 
relates to that in a universe with large-scale homogeneity. 
We show how use of the peculiar potential, which takes ac- 
count of departures from homogeneity through the subtrac- 
tion of a term including the mean density, allows the shear 
to be correctly computed. 

In Appendix B, we investigate rigorously the equiva- 
lence between two-dimensional and three-dimensional peri- 
odic approaches, in the absence of discrete angular diameter 
distances within the realisations. The treatment details the 
conditions under which the two approaches may be consid- 
ered to be equivalent. 
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In Appendix C, we summarise the Ewald (1921) sum- 
mation method, which we have used to assess the accuracy 
of our new code. We describe the method in outline, and 
compare the treatment with the P 3 M method. Finally, we 
develop the equations for the summation method which we 
have used in the testing of the results from our new algo- 
rithm. 



2 IMPLEMENTATION 

In this section we describe the numerical method used for 
measuring the local 3-dimensional shear in simulation data. 
The technique is an extension of the standard P 3 M algo- 
rithm familiar from cosmological particle codes. We begin 
with a brief review of the P 3 M method and then describe 
how it has been extended for the shear calculation. 



2.1 The P 3 M algorithm 

The P 3 M algorithm was developed in the context of parti- 
cle simulations of plasmas by Hockney, Eastwood and co- 
workers, (see Hockney & Eastwood, 1988, for a full descrip- 
tion), as an efficient method for calculating the pairwise in- 
teractions of a large number of particles. In the cosmologi- 
cal context, with the method being used to calculate forces 
arising from a large number of self gravitating particles, the 
method has two important attractions. First, for a nearly 
uniform distribution of particles, the computational cost is 
of order N\og 2 N where N is the number of particles, rather 
than the normal 0(N 2 ) scaling behaviour expected for a 
naive computation of the forces on N particles from each 
of their (N — 1) neighbours. The second attractive feature 
for cosmology is that the method, in its standard form, has 
periodic boundary conditions, and thus lends itself natu- 
rally to the simulation of a small part of the universe with 
the remainder being approximated by periodic images of the 
fundamental volume. 

The key idea in the method is to decompose the pair- 
wise interparticle force into a long-range and a short-range 
component which together sum to the required force. With 
a suitable choice of the decomposition we can ensure that 
the short-range force is compact (that is, it is non-zero only 
within a finite radius, i.e., the 'search radius') and that the 
long-range component has a band-limited harmonic content 
such that it can be accurately represented by sampling with 
a regular grid of a convenient mesh size. The total force is 
then accumulated on particles by summing directly a con- 
tribution corresponding to the short-range component of 
the force from nearby particles within the search radius, 
together with the long-range component which is interpo- 
lated from a smoothly varying force derived from the regular 
mesh. 

In practice, the calculation of the short-range force 
(from the direct particle-particle (PP) sum over near neigh- 
bours) and the long-range force (from the particle-mesh 
(PM) field calculation) depend on each other only to the 
extent that the accumulated force from the two should sum 
to the required total force. The two calculations may be 
performed in either order: here we shall describe the PM 
calculation first. 



The heart of the PM calculation is the solution of Pois- 
son's equation on a grid via a rapid elliptic solver; the 
method used here is Fast Fourier Transform (FFT) convolu- 
tion. A mesh-sampled density is obtained by smoothing the 
particle distribution onto a regular grid with an appropri- 
ate kernel. The properties of the kernel are chosen to filter 
the high frequencies present in the distribution so that the 
smoothed distribution may be adequately sampled by the 
mesh. The mesh potential is then obtained from the mesh 
density by FFT convolution. An advantage of using an FFT 
method is that it is possible to substantially reduce trans- 
lational and directional errors in the mesh-computed quan- 
tities by judicious adjustment of the Fourier components of 
the Green's function. For the standard force calculation, the 
components of the Green's function are optimized such that 
the rms deviation of the computed force from the desired 
force is minimized. Full details of these techniques may be 
found in Hockney & Eastwood (1988). A key feature of the 
method is that the Fourier-transformed density field may be 
smoothed by using a high frequency filter. This suppresses 
aliasing and leads to a more accurate pairwise force; that 
is one which has less positional and rotational dependence 
relative to the grid. 

Derivatives of the potential may then be obtained at 
mesh-points using standard finite difference techniques. A 
10 point differencing operator is used here to minimize di- 
rectional errors in the computed differences, (see Couchman, 
Thomas & Pearce, 1995). Values of the potential and its 
derivatives at arbitrary points in the computational domain 
are then obtained by interpolation from the mesh values. 
(Using the same kernel for interpolation as was used for 
particle smoothing ensures that particles do not experience 
self-forces in the standard P 3 M method.) 

The accumulation of the PP component of the force on a 
particle from near neighbours is achieved by re-gridding the 
particles onto a mesh which has a cell size such that the side 
is equal to the radial distance at which the short-range force 
falls to zero. This mechanism enables the neighbours con- 
tributing to the short-range force to be found efficiently by 
searching over the cell in which the particle in question lies 
and its 26 neighbouring cells. A disadvantage of this tech- 
nique is that as particle clustering develops in a simulation 
the average number of neighbours rises, causing the method 
to slow as the number of PP contributions which must be 
computed increases. A technique to overcome this deficiency 
in simulation codes has been developed, (Couchman, 1991), 
but the problem will not be of concern in this paper where 
we shall be concerned only with limited clustering. 

The method leads to accurate interparticle forces with 
the force error (arising from the mesh aliasing) being con- 
trolled by the degree of high frequency filtering employed in 
the Fourier convolution. A greater degree of attenuation of 
the high frequency components reduces the error but leads 
to a 'smoother' mesh force. This requires that the direct- 
sum search be performed out to larger radii, which in turn 
requires a search over a greater number of particles leading 
to a slow-down in the execution of the code. 

The method may be described schematically in the fol- 
lowing terms. Suppose that the total pairwise potential re- 
quired is <fi = f(r), where r is the radial distance from a 
particle. Then we compute this as tp = <ppp + ifprn, where 
^pp(r) = for r > r c , is the PP component and <ppm is the 
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mesh part. The functional form of <p is Coulombic on large 
scales (neglecting for the moment the effect of the periodic 
images) , with perhaps a softening at small scales to allow for 
the fact that each particle may represent a very large astro- 
physical mass, and to ameliorate certain numerical problems 
in the simulation code such as two-body scattering. 



2.2 Adaptation of the P 3 M algorithm for the 
calculation of the shear components 

The P 3 M method computes forces, or first derivatives of 
the potential, <j>, at a point by splitting the contribution of 
the density distribution into two components as described 
above. The potential itself is also computed as a simulation 
diagnostic in many standard P M implementations using 
the same splitting technique. In principle, any other non- 
local function computed from the field may be treated in the 
same way, and this is the approach taken here for the shear 
components, d 2 cj>/dxidxj. The implementation details spe- 
cific to the calculation of the shear values will be discussed 
here. 

The short-range part of the shear field at a point is 
accumulated directly from neighbouring particles from the 
appropriate Cartesian projections of the analytic function: 

where the sum is over all neighbour particles and x is the 
separation of a neighbour particle from the point at which 
the shear is desired. (We have used the prime notation to 
denote derivatives with respect to radial separation.) The 
short-range part of the field may be computed to machine 
accuracy. 

The long-range part of the shear field is derived by tak- 
ing a second difference of the force values as computed in the 
standard P 3 M method. The only difficulty is that differenc- 
ing the mesh field magnifies the noise which is present as a 
result of aliasing. Reducing this noise requires more filtering 
in the Fourier domain with a corresponding increase in the 
short-range cutoff, r c . 

Optimization of the Green's function appropriate for 
the shear calculation is done in a manner similar to that 
used for the force calculation. We minimized the sum of 
the squares of the deviations of all nine components of the 
shear, although a number of other reasonable possibilities 
exist. Minimizing the deviation of the diagonal components, 
for example, produced results that were little different. 

It would be possible to compute the mesh shear com- 
ponents by inverse Fourier transform of —kikj<j>(k) directly 
for each i, j such that 1 < i < j < 3, thus avoiding the real- 
space differencing. It would still be necessary to filter the 
field, however, and for the differencing operator used, the 
very small increase in accuracy would not justify the added 
computational cost of several further FFTs. 



2.3 Particle softening 

An important feature of numerical particle codes is the use 
of particle 'softening'. The effect of this is that each particle 
in the code represents not a point mass but a distributed 



mass with some given (fixed) radial profile. Softening is in- 
troduced primarily to avoid artificial (or numerical) relax- 
ation; that is, close two-body encounters leading to spurious 
energy redistribution in the system. Since in most simula- 
tions we are attempting to model the cosmic matter density 
as a collisionless fluid, this is a useful approach. (Note that 
the particle softening referred to here is distinct from the 
high frequency filtering, or smoothing, employed in the PM 
part of the calculation.) 

In numerical particle simulation codes it is usual to em- 
ploy a global softening for all particles (which may, however, 
vary in time). As the particle distribution evolves and parti- 
cles cluster the low density regions are represented by fewer 
particles. In a simulation computing interparticle forces to 
evolve the distribution of particles, this is of little conse- 
quence. However, if we wish to compute the shear values at 
some position on a ray passing through a low density region 
it may never come within the softening of the widely spread 
particles in the region. Since we are interested in the trace 
of the shear matrix, which is equivalent to the density, this 
would suggest that the density at this point was zero. (In 
fact the density would be negative since the total mass in 
the periodic simulation cube must be zero.) This is unreal- 
istic and does not accurately represent the matter density 
in these regions. Increasing the softening would ameliorate 
this situation in the voids but would smooth away the high 
frequency information present in regions where the particles 
have clustered into high density lumps. 

The approach we have taken is to employ a variable 
softening, such that a particle in a region of low particle 
density has a 'size' which is greater than that of a particle 
in a high density region. We have chosen a criterion similar 
to that used in hydrodynamic simulations using the SPH 
method (e.g., Gingold & Monaghan, 1977). Each particle is 
chosen to have a softening such that its sphere of influence 
is proportional to the distance to its 32nd nearest neigh- 
bour. Given these values the code appropriately modifies the 
short-range (PP) calculation to increment the shear compo- 
nents at a given point taking into account the varying sizes 
of the neighbouring particles. 



3 PERFORMANCE AND TECHNICAL 
ASSESSMENT 

3.1 Pairwise shear tests 

A minimal check of the technique may be made by com- 
puting the shear components at a large number of points 
surrounding a single massive particle. This is a useful test 
because the result is known analytically and it provides an 
immediate assessment of the errors present in the method. 

The test was made as follows. A single massive particle 
was placed at a random location in a mesh cell and the code 
used to measure the shear components at 16384 surround- 
ing points located randomly in direction and distributed in 
radius such that there was an equal number of points per 
equal logarithmic increment in radius. The test was then 
repeated a further 34 times using the same evaluation po- 
sitions, but different locations for the test particle designed 
to adequately sample the mesh cell. The shear components 
measured at each location were then compared with the true 
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Figure 1. Computed shear components and errors in these com- 
ponents arising from the single particle test described in Section 
3.1. For a complete description of the test and of the features 
shown in the figure refer to the text. All separations, r, are ex- 
pressed in units of the system's period. 



values derived from the Ewald (1921) summation technique 
as described in Appendix C. These comparisons are plotted 
in Figure 1. 

Panel a) of Figure 1 shows the absolute value of the 
radial component of the shear (solid line) as well as the 
fractional error in this quantity (scattered dots). Panel b) 
shows the same quantities for the two transverse compo- 
nents. In both cases the measured values have been multi- 
plied by an appropriate power of the separation, r, such that 
a pure Coulombic potential would show no radial variation. 
For the filtering chosen, the maximum error in these quan- 
tities is approximately 7 per cent, and occurs near the mesh 
scale used. The test was performed using a mesh of 64 cells, 
but the errors are essentially independent of the mesh size 
for typical values of the softening. The cyclic error at small 
separations arises because linear interpolation into a look-up 
table is used for the short-range force. The few points lying 
above the main scatter of errors in panel b) are due to one 
of the transverse components becoming very small with the 
consequence that the fractional error can become large. The 



rms error in our test on a single particle is less than 2 per 
cent. 

Panel c) plots the absolute value of the trace of the 
Ewald-computed shear matrix and the fractional error of 
the computed trace values. The particle softening chosen for 
this test was 0.002 in units of the box side dimension and, 
because of the definition of the particle softening employed, 
this corresponds to the particle having a radial extent of 
0.004. The true value of the trace for smaller separations 
thus reflects 4-7T times the density of the particle as required 
by Poisson's equation. At larger separations the true density 
is -1 (since the total mass in the system has to be zero), and 
thus the Ewald-computed shear is 47T. Large errors occur in 
the trace values simply because the shear values individu- 
ally are many orders of magnitude larger than 47r and each 
is in error by a few percent; we thus cannot expect the sum 
of these large values to cancel to high precision to give the 
required result. This fact was one of the primary motiva- 
tions for using a variable softening in which every position 
at which the shear was measured would lie within the ef- 
fective radius of some particle. In this case we may expect 
that the computed values of the trace will have roughly the 
same fractional error as shown in panels a) and b). This is 
discussed and tested further in the next section. 

Finally, panel d) shows the directional error between the 
principal eigenvector of the computed shear matrix and the 
separation vector. (The banding for small directional errors 
is due to numerical quantization in the computation of the 
directional error.) It is apparent that directional errors are 
very small in this method. 

Note that the errors in the shear values computed from 
an ensemble of particles are, in general, much smaller than 
the pairwise errors shown in Figure 1. (For ensembles used 
in typical A^-body simulations, the rms force errors are typ- 
ically 0.3 per cent.) This is because the Fourier represen- 
tation of a general particle distribution will have a smaller 
high frequency content than the equivalent representation of 
a single massive particle and can thus be represented better 
by a given fixed grid. Only if the PP and PM forces almost 
exactly cancel will the fractional errors be larger although 
in this case the absolute error will be small. There will be 
no appreciable change in the error values with larger mesh 
configurations normally used with Af-body simulation data. 

3.2 Comparison of measured trace and 

overdensity for a distribution of particles 

For a more realistic test of the code, we compare the com- 
puted shear trace with the density evaluated using a stan- 
dard SPH algorithm for one of our cosmological simulation 
boxes, described in Section 5.1. The SPH programme eval- 
uates a parameter, I, at each particle in the simulation box, 
representing half the distance to the 32nd nearest neighbour. 
This parameter defines the volume for the density calcula- 
tion, and the same parameter is applied in the shear code to 
establish an appropriate value of the softening for the parti- 
cle. In addition, a specific smoothing function may be used 
to distribute the mass throughout the volume so defined. 

To make a suitable comparison with the shear trace val- 
ues, we have computed overdensity values from these den- 
sities, and compared the ratio of the overdensity and the 
trace with the overdensity values. In Figure 2, we plot the 
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Figure 2. Ratio of overdensity from an SPH programme to the 
normalised shear trace, against the binned overdensity. The aver- 
age values of the ratio are shown by the broken line; the full line 
is drawn at the value of unity for comparison. 



average value of this ratio in each overdensity bin. We have 
used a minimum value of 0.0005 for the variable softening, 
and analysed the data from 10000 particle positions. 

Because of the very different ways in which the densi- 
ties are determined (from the shear trace in the new code, 
and from particle numbers in the SPH programme, and the 
differing shapes of the softening functions used), we expect 
some dispersion in the values at all densities, and this is 
indicated by the la error bars. The form of the plot is eas- 
ily understood throughout its entire range. At low densities, 
the SPH density values are underestimated because isolated 
particles do not have the requisite number of nearest neigh- 
bours within the particle mesh. At high densities, the par- 
ticle softenings will be at the same minimum level for all 
the particles, retarding the amount of increase in the shear 
trace values as the real density continues to rise. This effect 
causes the upturn shown in Figure 2, which occurs at the 
overdensity value of 4.5 x 10 3 for a minimum softening of 
0.0005. 

The equality of the particle overdensities and shear 
trace values over more than three decades in density, gives 
us considerable confidence in the use of the shear algorithm 
generally, and for trace values determined at particle posi- 
tions, or within the softening range. 



Figure 3. One of the shear components, d 2 <f>/dxdy, as a function 
of the radial extent of matter included. 

line of sight, or to include the effects of matter over consid- 
erable distances from the line of sight. 

We investigated this rate of convergence of the shear 
values for one of our simulation boxes, (which we describe 
in Section 5.1). By using a straightforward direct-summation 
method for the particle contributions to the shear, we eval- 
uated one of the off-diagonal two-dimensional shear compo- 
nents as we progressively added mass out to a radial extent 
of 2.5 box units. Beyond a radius of 0.5 from the central posi- 
tion, the particles were laid down with the periodicity of the 
fundamental volume. The depth was one box unit through- 
out, because of the inbuilt periodicity along the line of sight. 
(We show in Appendix B that provided there is periodicity 
along the line of sight, the two-dimensional shear values will 
equate with the three-dimensional results integrated over a 
single period.) 

Figure 3 clearly shows that by including the matter 
within a single period only, (to a radius of 0.5), values for the 
shear components will, in general, be seriously in error. Of 
course, different simulations and particle distributions will 
display different rates of convergence to the limiting val- 
ues. However, it is quite clear that by making correct use of 
the periodicity in simulations (as an approximation to the 
distribution of matter outside of each simulation cube), to- 
gether with the net zero mass requirement, more realistic 
component values are achieved. Other approaches which do 
not employ these two conditions may suffer from inadequate 
convergence to the limiting values. 



4 SOME ADVANTAGES OF THE 
THREE-DIMENSIONAL METHOD 

4.1 Convergence to limiting values 

In the Introduction we outlined some of the considerations 
we made before developing the new three-dimensional algo- 
rithm. We expressed concern that shear values in general 
may converge only slowly to their true limiting values as in- 
creasingly large volumes of matter are included around an 
evaluation position. If so, it would be essential either to take 
full account of the periodicity of matter orthogonal to the 



4.2 The effects of angular diameter distances 

Our three-dimensional approach allows the use of the appro- 
priate angular diameter distances at every single evaluation 
position. This is not possible in two-dimensional approaches, 
where it is assumed that all the lensing mass is projected 
onto a plane at a single angular diameter distance. 

By definition, the angular diameter distance of a source 
is the distance inferred from its angular size, assuming Eu- 
clidean geometry. In an expanding universe, therefore, the 
angular diameter distance becomes a function of the redshift 
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of the source (and of the observer). In addition, the inclu- 
sion of excess matter within the beam causes the beam to 
become more focussed, and makes the source appear closer 
than it really is. By considering the universe to be populated 
by randomly distributed matter inhomogeneities, but resem- 
bling the Robertson- Walker, Friedmann-Lemaitre model on 
large scales, (see Schneider, Ehlers and Falco, 1992), a sec- 
ond order differential equation is obtained for the angular 
diameter distance, D, in terms of the density parameter, Q, 
for the universe, and the redshift, 2, of the source: 



( Z + l)(^ + l)g+(^ + § 



+ 3) + -QD = 0. 



(2) 



Dyer and Roeder (1973) made assumptions about the 
type of matter distribution to obtain a more general and 
practical equation. They assumed that a mass fraction, a, 
(called the smoothness parameter), of matter in the uni- 
verse is smoothly distributed, and that the fraction (1 — a) 
is bound into clumps. Then the equation for the angular 
diameter distance becomes 



(z + 1)( n, + i)g + (In, + H + : 



D = 0, 



(3) 



in which shear, a, is introduced by the matter distribution 
around the beam. They considered the following scenarios 
for the application of this equation. First, they considered 
a universe in which all the matter is bound into clumps, so 
that a = 0, and in which the light beam passes far away from 
the clumps. This is described as light propagating through 
an 'empty cone,' and gives rise to maximal divergence of the 
beam. The second scenario is more general and practical, in 
that it uses an intermediate value for the smoothness pa- 
rameter (0 < a < 1), but still requires the beam to pass 
far away from the clumps. In this case the beam contains a 
proportion of the smoothed matter distribution which intro- 
duces convergence, and hence some degree of focussing. The 
third scenario has a = 1, i.e., an entirely smooth universe. 
Here the smooth matter distribution is present within the 
beam, giving a 'full cone,' or 'filled beam' approximation. 

In all of these scenarios the term including the shear 
in equation (2) is minimised, so that the final 'Dyer-Roeder 
equation' becomes 

( z + i)(n* + i)0 + (^n* + ^+3) + |ani3 = o, (4) 

and can be solved for different values of Q and a. For Q — 1 
and a = 1 (filled beam), Schneider et al. (1992) quotes the 
result for the angular diameter distance between an observer 
at redshift z\ , and a source at redshift 2 2 , as 



D(z u z 2 ) = —2 
no 



1 



(1+Zl)5(l + Z 2 ) (1 + Z 2 )5 



(5) 



(6) 



D(zi,Z2) = —r(zi, z 2 ) 
no 

where r (21,22) is the dimensionless angular diameter dis- 
tance, c is the velocity of light, and Ho is the Hubble pa- 
rameter. 

Magnification values, p, derived using Dyer and 



Roeder's angular diameter distances will be affected accord- 
ing to the approximation used. For example, rays passing 
close to clumps or through high-density regions will result 
in magnification in any approximation. If the empty cone 
approximation is used, then jj, will be greater than 1, and if 
the full cone approximation is used, then /x will be greater 
than the mean magnification, < fi >, since these would be 
the respective values in the absence of lensing. Rays passing 
through voids will have p, = 1 in the empty cone approxi- 
mation (since the rays will be far from all concentrations of 
matter, and will satisfy the empty cone conditions). In the 
full cone approximation, fi < 1 because the rays will suffer 
divergence. However, the minimum value in this case will 
be, (Schneider et al, 1992), 



11 n 



D(z;a= 1) 



D(z;a = 0) 



(7) 



In addition, the mean value of the magnifications derived 
from a large number of lines of sight through a cosmological 
simulation should be close to unity in either the empty cone 
or the filled beam approximation. 

In the testing of our new algorithm we have used the 
filled beam approximation (a — 1) to obtain the angular di- 
ameter distances. With variable softening, most of the rays 
will pass through a slowly varying density field, justifying 
this choice, although the smoothness parameter should be 
different from unity, and possibly should evolve slowly with 
time. (Tomita, 1998b, finds, by solving the null-geodesic 
equations for a large number of pairs of light rays in four dif- 
ferent cosmological A^-body simulations, that the best value 
for a is almost equal to 1, although with considerable dis- 
persion.) However, the approximation that all rays should 
pass far from the clumps will not be strictly true, as shear 
on the light rays will be very much in evidence. 

We show in Figure 4 the value of the factor r d r ds /r s , 
where r d is the dimensionless angular diameter distance from 
the observer to the lens (here, the front face of each simula- 
tion box), r ds is that between the lens and the source, and 
r s is that for the observer-source, where we have taken the 
source to be at a redshift of 5. This factor, r d r da /r s , is used 
to multiply the shear component values generated in the 
code, and we see that it has a peak near 2 = 0.5 for a source 
redshift of 5. The curve is very steep near z — indicating 
large fractional differences between the value of r d r ds /r s at 
the front and the back of simulation boxes at late times, 
where considerable structure may also be present. 

To obtain the absolute shear component values we must 
also introduce scaling factors which apply to the simulation 
box dimensions. The appropriate factor is B(l + z) 2 r d r da /r a , 
where B — 3.733 x 10 -9 for the simulation boxes we have 
used, which have comoving dimensions of 100/i -1 Mpc. The 
(1 + z) 2 factor occurs to convert the comoving code units 
into real units. By evaluating this factor at the front and 
rear faces of each simulation box, we can obtain an esti- 
mate of the maximum error associated with projecting the 
mass distribution onto a plane. In Figure 5, we plot the 
percentage differences in this factor between the front and 
rear faces of each simulation box, and show the results for 
boxes of 50/i _1 Mpc, 100/i _1 Mpc, and 200/i _1 Mpc comoving 
depths. The figure clearly indicates the possible presence of 
large errors when boxes are treated as plane projections. 
The errors are considerable at high and low redshifts, and, 
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Figure 4. The factor r^r^ s /r s vs. box redshift, assuming a source 
redshift of 5. 



in particular, they are significant near z = 0.5, where the 
angular diameter factor rdTds/fs is greatest. For simulation 
boxes of 50/i -1 Mpc the difference is 4.5% near z = 0.5, for 
boxes of 100/i~ Mpc the difference is 9.0%, and for boxes of 
200h -1 Mpc the difference is 16.3%, for a source at z = 5. 

Obviously, the front-rear differences are smallest in the 
smallest boxes, (here 50h~ Mpc comoving depth), but with 
small simulation boxes there are problems in adequately 
representing the extent of large-scale structure, and, as we 
have seen in Section 4.1, a serious question as to whether 
two-dimensional shear values can be correctly determined 
by considering matter out to such small radii in the trans- 
verse direction. Even with 100ft -1 Mpc boxes we have shown 
that the two-dimensional shear values can be seriously in 
error when only matter within the fundamental volume is 
included. With 2QQh~ Mpc boxes, better convergence of val- 
ues may be obtained because of the larger spread of matter 
transverse to the line of sight; however, the range in the 
angular diameter distance factors along the line of sight is 
greater, introducing larger errors. To reduce this error, it 
may be thought that the boxes could be divided into a num- 
ber of planes; however, this procedure would give erroneous 
values for the shear because a full single period in depth is 
required, as we show in Appendix B. 

In Section 5.3 we show that an approximation for the 
magnification in weak lensing is 



(j, ~ 1 + (V>n + ^22), 



(8) 



where the tp values are the two-dimensional 'effective lensing 
potentials,' derived from integrating the three-dimensional 
components. Consequently, we may also find that the errors 
in the shear component values, arising from ignoring the 
angular diameter distance factors through the simulation 
boxes, enter into calculations of the magnification. 



APPLICATION OF THE CODE TO 
LARGE-SCALE STRUCTURE 
SIMULATIONS 



Figure 5. The percentage difference in the multiplying factor for 
the shear component values generated by the code, between the 
front and back faces of each simulation box. The figure shows 
the differences for simulation boxes of different comoving depths, 
highlighting possible errors associated with plane projections. 



5.1 Brief description of the LSS simulations used 

Our three-dimensional shear code can be applied to any 
three-dimensional distribution of point masses confined 
within a cubic volume. Each particle may be assigned an 
individual mass, although in our tests of the code, we have 
assumed all the particles to have the same mass. In addi- 
tion, the code allows for either a fixed softening value for 
each particle, or a variable softening, dependent on each 
particle's density environment. 

We applied the code to the data bank of cosmo- 
logical i V-body simulations provided by the Hy dra Con- 
sortium (ittp://coho. astro. uwo.ca/pub/data.html) and pro- 
duced using the 'Hydra' iV-body hydrodynamics code 
(Couchman, Thomas and Pearce, 1995). 

Our initial tests, described here, have used individ- 
ual time-slices from these simulations using 128 3 particles 
with a cold dark matter (CDM) spectrum in an Einstein- 
de Sitter universe. Each time-slice has co-moving sides of 
IOO/1" 1 Mpc. Since each is generated using the same ini- 
tial conditions, we arbitrarily translate, rotate and reflect 
each time-slice to prevent the formation of unrealistic corre- 
lations of structure along the line of sight, when the boxes 
are linked together. The simulations used have density pa- 
rameter Qq = 1 and cosmological constant Ao = 0. The 
power spectrum shape parameter, F, has been set to 0.25, 
as determined empirically on cluster scales (Peacock and 
Dodds, 1994), and the normalisation, as, has been taken as 
0.64 to reproduce the number density of clusters (Vianna 
and Liddle, 1996). The dark matter particle masses are all 
1.29 x lO 11 /!" 1 solar masses. 



5.2 The choice of softening 

We have chosen a softening function for the radial distribu- 
tion of mass for each particle, such that light rays feel the ex- 
istence of a smooth mass distribution. Our code also allows 
for variable softening, so that each particle may be assigned 
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its own softening-scale parameter, depending on the parti- 
cle number- density in its environment. In this way, it can 
be used to minimise the effects of isolated single particles, 
whilst the smoothed denser regions are able to represent the 
form of the large-scale structure. The parameter we have 
chosen to delineate the softening scale for each particle is 
proportional to I, where 21 is the radial distance to the par- 
ticle's 32nd nearest neighbour. The value of I is evaluated 
for every particle by applying our SPH density programme, 
as described in Section 3.2, to each simulation box. 

We allow the maximum softening to be of the order of 
the mesh dimension for isolated particles, which is defined 
by the regular grid laid down to decompose the short- and 
long-range force calculations. In this way the density values 
are improved, as we described in Section 2.3. This also means 
that individual isolated particles are unable to strongly in- 
fluence the computed shear values, in accordance with our 
need to study the broad properties of the large-scale struc- 
ture, rather than the effects of individual particles. 

Our new algorithm works with the ratio of the chosen 
softening (proportional to I) for each particle, to the max- 
imum value (dependent on the mesh size), so that the pa- 
rameter used has a maximum of unity. Our method, which 
employs the variable softening facility, contrasts markedly 
with that of other workers. As an example, Jaroszyhski et 
al. (1990), who evaluate deflections due to density columns 
projected onto a plane, apply no softening function, except 
to assume that all the mass within each column is effectively 
located at its centre. 

In the CDM simulations we have used, the minimum 
values for 21 are of order 10~ 3 ; e.g., for the redshift z — 
0.4986 box, the minimum 21 = 1.02383 x 10~ 3 , (equiva- 
lent to 68/i~ 1 kpc). This is comparable to the Einstein ra- 
dius, Re, for a large cluster of 1000 particles, (for which 
Re = 82/i _1 kpc for a lens at z = 0.5 and a source at z = 1). 
Consequently, by setting a working minimum value for the 
variable softening of 10 we would rarely expect to see 
strong lensing due to caustics in our simulations. Also, the 
radial extent of this minimum softening is of the order of 
galactic dimensions, thereby providing a realistic interpre- 
tation to the softening. 

Having justified our chosen value for a working min- 
imum value, it is also important to understand the sensi- 
tivity of our results to the input softening. Figure 6 shows 
the distribution of magnifications due to a single simulation 
box (z — 0.4986), and assumed source redshift of 1. The 
distributions using minimum softenings of 0.001 and 0.002 
are extremely close, whilst the differences are more apparent 
with the minimum softenings of 0.003 and 0.004. This is ex- 
tremely helpful, because it means that if we set the minimum 
softening to 0.001 in the z = box (equivalent to 100/i _1 
kpc) and to 0.002 in the z = 1.0404 box (also equivalent to 
lOO/i^ 1 kpc), then our results are likely to be little different 
from results using the same minimum softening throughout. 

To highlight the sensitivity to the minimum softening, 
we plot in Figure 7 the accumulating number of lines of sight 
having magnifications greater than or equal to the abscissa 
value. As expected, we see that the results using the small- 
est minimum softenings give rise to the highest maximum 
magnifications. 
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Figure 6. Magnification distributions for various minimum soft- 
enings, 0.001, 0.002, 0.003, and 0.004 in a simulation box at 
z = 0.4986, for a source at z = 1. 
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Figure 7. The accumulating number of lines of sight for which 
the magnification is greater than or equal to /i. The plots show 
the results of different minimum softenings in a simulation box 
at z = 0.4986, for a source at z = 1. 

5.3 Multiple lens-plane theory for magnification 
distributions 

There are two very important properties of our three- 
dimensional algorithm for shear which make it eminently 
suitable for use within particle simulations. 

First, each simulation box is treated as a periodic sys- 
tem, so that the contributions from all particles and their 
images are included in the shear computations. 

Second, as far as we are aware, this is the first algorithm 
successfully adapted for TV-body simulations, in which the 
shear components may be evaluated at a large number of 
locations throughout the extent of the box. In this way each 
of the selected locations may be considered as an individual 
deflection site, and deflections computed using individual 
angular diameter distances for each site. Two-dimensional 
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planar approaches, by contrast, are able to compute only 
one deflection for each ray at each projection plane, and such 
planes will be assumed to be at a single particular angular 
diameter distance. 

The first property of our algorithm gives confidence in 
the shear component values computed, whilst the second 
property enables us to trace the behaviour of rays through- 
out the full depth of each simulation box. 

To do this we construct a rectangular grid of directions 
through each box. Since we are dealing with small deflec- 
tions, and are interested only in the statistics of the output 
values, we consider each light ray to follow one of the lines 
defined by these directions through the box. The evaluation 
positions are specified along each of these lines of sight. 

The six independent second derivatives of the peculiar 
gravitational potential are calculated by the code at each 
of the selected evaluation positions throughout a simulation 
box. We then integrate the values in small chunks along each 
line, forming, essentially, a large number of planes through 
each simulation box. These integrated values form the input 
data to establish the elements of the Jacobian matrix, A, on 
each of the lines of sight for each of the deflection sites. 

We make use of the multiple lens-plane theory, which 
has been developed by Blandford and Narayan (1986), 
Blandford and Kochanek (1987), Kovner (1987), and Schnei- 
der and Weiss (1988a, b), and described in detail in Schnei- 
der et al. (1992). At the first deflection site from the source 
we evaluate the components of the Jacobian matrix, 



A = 



1- ipu 

— y>2i 



— V>12 
1 — ^22 



(9) 



in which the two-dimensional 'effective lensing potentials' 
are obtained from the three-dimensional second derivatives 
of the gravitational potential: 



D s 



1/ 



d 2 4>(z) 
dxidx. 



dz, 



(10) 



where Dd, Dds, and D s are the angular diameter distances 
from the observer to the lens, the lens to the source, and the 
observer to the source, respectively. At subsequent deflection 
sites we obtain the developing Jacobian matrix recursively, 
since the final Jacobian for N deflections is: 



A 



-total 



where X is the unitary matrix, 



V>21 V>22 



(11) 



(12) 



for the ith deflection, and each of the intermediate Jacobian 
matrices can be written as 



Aj =l-J2/3 ij U i Ai 



where 

Pl3 ~ D is Dj ' 



(13) 



(14) 



in which Dj , Di S and D%j are the angular diameter distances 
to the jth lens, that between the ith lens and the source, and 
that between the ith and jth lenses, respectively. 




Figure 8. Magnification vs. redshift along an arbitrary line of 
sight, using two different minimum softening values. The lower 
curve has a minimum softening of 0.001, and the upper curve has 
a minimum softening of 0.004. 



The magnification is, in general, 
/i = (AetAy 1 , 



(15) 



so that we can assess the magnification as it develops along 
a line of sight, finally computing the emergent magnification 
after passage through an entire box or set of boxes. For ex- 
ample, Figure 8 shows the development of the magnification 
through a single isolated simulation box (z — 0.4986) with a 
chosen source redshift of 1. The slightly different emerging 
magnifications arise because of the choice of different input 
minimum softening values. The figure shows an arbitrary 
line of sight, and we have assumed that the redshift varies 
linearly through the box. It should be noted that the mag- 
nifications derived using minimum softenings of 0.001 and 
0.002 are almost identical. 

The convergence, k, is defined by 



-(V'n +1P22) 



(16) 



from the diagonal elements of the Jacobian matrix, and 
causes isotropic focussing of light rays, and so isotropic mag- 
nification of the source. Thus, with convergence acting alone, 
the image would be the same shape as, but of larger size 
than, the source. 

The shear, 7, in each line of sight, is given by 



7 2 = ^(V'ii --fe) 2 + V>12- 

This is sometimes written in component form: 



7 



2 . 2 
7i + 72 , 



where 



7i = 
and 



^(Vn 



^22) 



(17) 



(18) 



(19) 



72 = Vi2- (20) 

Shear introduces anisotropy, causing the image to be a dif- 
ferent shape, in general, from the source. 
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Figure 9. Shear values vs. the convergence in a single simulation 
box, showing the rapid rise with k, and the flattening off when 
the minimum softening applies. 



Figure 10. A comparison of the magnification distributions for 
three different simulation boxes, placed at their correct distances, 
with a source at z s = 2. 



With weak lensing, and these definitions, the magnifi- 
cation reduces to 



fj, ~ 1 + (ipu +V22) = 1 +2k. 



(21) 



In the presence of convergence and shear, a circular source 
becomes elliptical in shape, with major and minor axes 



a = 

and 
b = 



(l-K-7)' 



1 



(l-K + 7)' 
so that the ellipticity, e, is given by 

b 1 — k — 7 
a 1 — k + 7 ' 

which reduces to 
e ~ 1 - 2 7 



(22) 



(23) 



(24) 



(25) 



in weak lensing. 

Distributions and relationships amongst all these quan- 
tities can be determined straightforwardly. As an example, 
Figure 9 shows the shear and convergence in a single (as- 
sumed isolated) simulation box, (z — 0.4986), with a source 
at redshift z s = 1. The minimum (variable) softening has 
been set at 0.001. As expected, the shear increases rapidly 
with increasing convergence, (density), until a maximum 
value is reached where the minimum softening applies. Mea- 
surements of the magnification and ellipticity show the lin- 
ear dependences on k and 7 according to equations (21) 
and (25), as expected. Small departures from linearity are 
apparent only at high values of k and 7. 

Figure 10 is an example of the magnification distribu- 
tions in three different simulation boxes, which are assumed 
to be isolated in space. We show the distributions for the 
z = box, the z = 0.4986 box, and the 2 = 1.0404 box, 
each with minimum softenings of 0.001. A source redshift of 
z s — 2 has been chosen. (All the distributions have a mean 
magnification value of 1.) 



maanification 1.255 
. magnification 1.692 
nification 7.670 




.05 

Magnification, ,.a 

Figure 11. A comparison of the magnification distributions in 
three different simulation boxes, all assumed to be located at the 
same redshift, 1.0404. 



In Figure 11 we show the results from the same three 
simulation boxes, for a source at z = 2, but here assumed to 
be all located at the same position (z = 1.0404). This allows 
direct comparisons between the boxes to be made in terms 
of the formation of structure within them. For example, the 
later boxes show higher values for the maximum magnifica- 
tions, and have shallower slopes in the distributions at the 
high magnification end. The peaks in the distributions for 
the later boxes occur at slightly lower magnification values, 
whilst all have mean magnifications of unity, as required. 

By a simple extension of the multiple lens-plane theory, 
we are now able to take the emergent ip values from each 
simulation box, and feed them into a string of subsequent 
boxes. In this way, we are able to obtain all the necessary 
emergent parameters at z = arising from a source at high 
redshift. We shall be reporting on these results in a future 
publication. 
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6 SUMMARY AND CONCLUSIONS 

In this paper we have discussed our motivations for devel- 
oping a new algorithm for use with cosmological iV-body 
simulations in the study of weak gravitational lensing. We 
have also described the algorithm we have developed to- 
gether with its variable softening refinement, and we have 
tested the output results from three-dimensional simulations 
against the Ewald (1921) summation method for the shear 
components. We have described how the results from the 
new code can be applied to realistic simulations by including 
the appropriate angular diameter distances at every evalua- 
tion position. In this way it is very straightforward to com- 
pute the final magnifications, source ellipticities, shear and 
convergence values as a result of the passage of light through 
linked simulation boxes. The main points we have discussed 
are the following. 

1. Appendix B rigorously shows that results from the 
two-dimensional and three-dimensional approaches to weak 
lensing are equivalent only if the mass distribution is pe- 
riodic along the line of sight, a full period (in depth) is 
considered, and the angular diameter distances are assumed 
constant throughout the depth. 

2. In order to evaluate the shear components correctly, 
it is necessary to work with the peculiar potential, which 
we describe in Appendix A. This applies equally to two- 
dimensional and three-dimensional methods. 

3. The results for two-dimensional planar projections 
may be invalid if matter outside the single period plane (in 
directions orthogonal to the line of sight) is not included. 
Because of the slow convergence of the potential and shear 
components to their limiting values, it is necessary, in gen- 
eral, to include the effects of matter well beyond a single 
period, but depending on the specific mass distribution. 

4. The inclusion of the appropriate angular diame- 
ter distances at every evaluation position within a three- 
dimensional realisation avoids errors in the shear and mag- 
nification values. The errors incurred by treating simulation 
boxes as planes may be as much as 9% in a single box of 
depth 100/i _1 Mpc at a redshift of 0.5 (where the lensing ef- 
fects are greatest, for a source at z = 5). (At low and high 
redshifts the fractional errors are greater.) 

5. The output from our algorithm is the three- 
dimensional shear components evaluated at a large number 
of positions within a periodic iV-body simulation cube. The 
code itself is a development of the standard P 3 M algorithm 
which determines forces (the first derivatives of the poten- 
tial), and the potential itself. The short-range part of the 
shear field at a point is accumulated directly from neigh- 
bouring particles, whilst the long-range part is obtained by 
taking a second difference of the force values. The computa- 
tional cost of the P 3 M method is low, being of order JVlog 2 iV 
rather than of order iV 2 . 

6. The PM calculation uses a FFT method in which the 
density distribution is smoothed, and can be well sampled 
by the mesh. The mesh potential is then obtained by FFT 
convolution. Errors in the method can be minimised by suit- 
able adjustment of the Fourier components of the Green's 
function. 

7. A key feature of the new algorithm is the facility to 
input a variable softening parameter. The feature enables 
particles in low density regions to have extended softening, 



so that nearby evaluation positions register a density rather 
than a complete absence of matter. By contrast, particles 
in highly clustered regions are assigned low softening val- 
ues, and a selected minimum softening value is introduced 
to avoid singular (strong lensing) behaviour. The variable 
softening feature enables a much more realistic depiction of 
the large-scale structure within a simulation to be made. 

8. In Appendix C we summarise the Ewald (1921) sum- 
mation method, and develop the equations for use as a com- 
parison with the values for shear obtained with our new 
algorithm. 

9. By choosing an appropriate filter, we are able to set 
limits for the maximum errors in the computed shear values 
from our code. In the tests, the maximum errors in both the 
radial and the transverse components of the shear are about 
7 per cent for the effects of a single particle, when compared 
with the values obtained using the Ewald formulas. The rms 
errors are less than 2 per cent, and errors for ensembles of 
particles, to which we intend to apply the code, are typi- 
cally 0.3 per cent (rms). The errors in the trace of the shear 
matrix can be large, because the trace frequently involves 
the addition of nearly equal and opposite (but large) values. 
Individually, however, the errors in each component remain 
small. 

10. We have tested the data also against the output of 
a completely different programme for the density at particle 
locations. There is good agreement for the normalised den- 
sity from this programme when measured against the shear 
trace from our new algorithm. 

11. The output from the code can be used together 
with the multiple lens-plane theory and appropriate angular 
diameter distances to obtain values for the magnification, 
source ellipticity, shear and convergence for a large num- 
ber of lines of sight as they emerge from a simulation box. 
We show a typical distribution plot for the emerging mag- 
nification, having given proper consideration to the desired 
minimum value for the variable softening. 

12. We commend the algorithm for use in periodic N- 
body simulations from which the data can be manipulated to 
obtain emergent values from linked simulation cubes cover- 
ing great distances. In this way, such a procedure also allows 
the comparison of results from different cosmologies. It is an- 
ticipated that our algorithm will become publicly available 
in enhanced form in due course. 
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APPENDIX A: PERIODICITY AND THE PECULIAR POTENTIAL 

A common method for modelling a section of the universe is to consider a distribution of masses within a triply periodic 
cube. In this appendix we examine how the peculiar potential in this model relates to that in a universe with large-scale 
homogeneity. 



Al The peculiar potential 

In Friedmann-Robertson- Walker models when considering the growth of perturbations, gravitational lensing by cosmic struc- 
ture, etc., we are interested in deviations from homogeneity. The quantity of interest in this case is the peculiar potential. 
This is derived in the usual way by writing the equation of motion, 

£ - -V,», (Al) 

for matter at position r, in terms of a comoving coordinate r = ax, where a is the expansion scale factor. The peculiar 
potential — the source for deviations from homogeneity — is 

(f, = $ + i/ 2 adx 2 . (A2) 

The rate of change of expansion velocity, d, is determined by the mean density of matter, p, leading to the familiar results 

V 2 (t> = 4ivGa 2 (p-p) (A3) 
and 

$ = $ -^/^Gapx 2 (A4) 



(for full details see, for example, Peebles, 1993). The peculiar potential arising from Poisson's equation (A3) corresponds to 
a system with zero net mass on large scales. 

Consider now a model universe of masses periodic in a cube of side L. The forces generated by the matter distribution 
satisfy F(x + Ln) = F(x) where n is an integer triple. Thus integrals, J l2 F.dS, where S is an outward normal, taken over 
opposite faces of the cube, sum to zero, and J F.dS over the surface of the cube vanishes. The divergence theorem and 
Poisson's equation, \7 2 (f> = 4irGp' , then imply 

4nG p'dV= V 2 (f>dV = - V.FdV = - / F.dS = 0, (A5) 

J L3 J L3 JL3 J.S 

and hence the total mass in the system is zero. 

This result is a consequence of solving Poisson's equation in a periodic cube. We see that the real result for the shear, 
obtained from the second derivatives of the peculiar potential, <j>, is related to the naive result based on the full gravitational 
potential, $, through the use of p' = p — p. 



Note that the zero mean density implicit in equation (AJ) is a result of the coordinate transformation; that this transfor- 



mation is well motivated is a reflection of the large-scale homogeneity of the universe. The zero mean density in equation ( A5), 
on the other hand, is simply a result of the imposed periodicity. The two views converge only for a periodic cube sufficiently 
large that the amplitude of the first few discrete Fourier modes are small enough that they describe a smooth transition to a 
zero mean value and homogeneity. 



A2 The peculiar potential in a periodic system 

We begin by demonstrating a useful result relating the Fourier transforms of continuous functions and the Fourier coefficients of 
the periodic functions constructed from them. First, as a convenient mechanism for translating between Fourier representations 
of continuous and periodic functions, we define the three-dimensional comb TTT (r) = <5(x — n), where n is an integer 
triple (see, for example, Hockney & Eastwood 1988). The Fourier transform of this function is the comb UI (k/27r) 
Consider the function, with period L, constructed by periodically repeating /: 

/t(x) = HI(x/L)*/(x) (A6) 
= ^/(x-nL). (A7) 



Applying the Fourier convolution theorem to equation (A6) we see immediately that the Fourier transform of the periodic 
function, is 

^ = inf T ^-V(k) J (A8) 



\L/2n 

where / is the continuous Fourier transform of /. 

Equation QAq) gives the Fourier series representation of the periodic function directly and demonstrates the familiar result 
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that the Fourier coefficients of the continuous periodic function, obtained by accumulating repeats of the continuous function 
(here ^ /(x — nL) obtained from /), are the same as the Fourier components of the continuous function at wavenumbers 
k = 2tt1/L. This result is the analogue in real space of aliasing in Fourier space. If the continuous function is zero outside the 
fundamental cell the periodic representation is simply obtained by tiling space with repeats of the fundamental cell. (This is 
the real-space analogue of a band-limited function.) 

Consider now N particles distributed in a cube of side L. Let the position of particle i be x;. The density is then 



P(x) 



E 



m i< 5(x~x j ), 



(A9) 



without, for the moment, requiring zero total mass. Where necessary, we can consider a periodic density distribution con- 
structed by tiling space with periodic repeats of the distribution in the fundamental cube: 



p'(x) = HI(x/L)*p(x). 

The gravitational potential at a point x in the periodic system is 
0(x) = GlII(x/L) * p(x) * p(x), 



(A10) 



(All) 



where ip is the pairwise 'interaction' potential (or Green's func tion). Equation (All) may be interpreted in two ways. We may 
consider a periodic distribution of matter as in equation (A10) convolved with the regular interaction potential. Alternatively, 
we can restrict attention to the matter in the fundamental zone and consider a modified interaction potential 



V9 + (x) = LH(x/L) *<p(x) 
<p(x — nL). 



n 

Both of these interpretations will be u seful. 

Using the result in equation (All), the Fourier series coefficients of <f> are, for k = 2n\/L and 1 an integer triple, 



</> k = Gp(k)£(k), 
where p(k) = ^ . m^e" 



(A12) 
(A13) 



(A14) 



is the continuous Fourier transform of the particle distribution in the fundamental zone (equa- 



tion (A9)) and <p(k) is the continuous Fourier transform of the interaction potential. 



Requiring the mean mass to be zero is equivalent to subtracting a component equal to ^. wij/I from the density in 

(A15) 



equation (A9), or setting <^k=o = in equation (All). With this modification equation (All) becomes 
<j>(x) = G IU(x) * ^ rrij p{x - Xj) - i J tp(x - x') d 3 x' 



= G 



mj(p(x ■ 



L 3 



— nL — x') d 3 x' 



£3 



^2 ¥3(x - Xj n ) - i / <p(x')d 3 x' 



<p(x- 



jn) - -^3 j <p(x - nL) d 3 x' 



(A16) 

(A17) 
(A18) 



where Xj n = Xj + nL and jn labels the image of the j-th particle in the image cell n. We may write equation (A18) as 



0(x) = G P (x)*^(x), 
where we now have for the density 

1 



P(x) 



3 



S(x-Xj) 



and equation (]A13|) is now 



^(x) = 



E 



<p(x -nL) - — 



\ J 3 <p(x'-nL) d 3 x' 



(A19) 
(A20) 

(A21) 



It is unnecessary for the mean value of both p and to be zero. The form for the modified pote ntial is convenient, however, 
since, for a potential which is Coulombic at large scales, with tp ~ 1/ r, th e form in equation (A21) is convergent whereas 
the form in equation (A13) is not: nor does the integral in equation (A17) converge. The Fourier synthesis of the peculiar 
potential is 
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(A22) 



A3 Particle softening 



The delta function in equation ( A£ ) may be replaced by any compact even function, with volume integral equal to unity, to 
represent a distribution of softened particles. (The softening will be considered fixed for the present to permit Fourier analysis.) 
However, since it is only the interaction of the particles via the gravitational field which is relevant, we may, equivalently, 
describe any particle soft ening by modifying the pairwise (Coulombic) potential, ip, at small separations. 
From equation (A.22) we can immediately write 



G 



E 

k/0 



e !kx p(k) k 2 (p(k). 



(A23) 



For a Coulombic interaction potential, <p(r) = — 1/r, we have <p(k) = — 4-7r/fc 2 . If we set (p(k) = -4irS(k)/k 2 with 5(0) = 1, 
then S(R) describes the departure of the i nteraction potential from Coulombic at small scales and, as we will see, plays the 
role of a particle softening. Equation (A.23) becomes 



V 2 ^ - ^$>' k -p(k)5(k) 

= 4ttGp(x) * S(x) 
= 4nGps 

which is Poisson's equation for a distribution of softened point charges; 

1 



ps(x) = 4-kG 



£ [S(x- Xj 



(A24) 

(A25) 
(A26) 



(A27) 



Note, that if we wanted the force on a softened particle from the distribution of softened particles (rather than merely 
sampling the density field at a point), the appropriate interaction potential in Fourier space would be (p = —4irS 2 (k)/k 2 . 



APPENDIX B: EQUIVALENCE OF 2-D AND 3-D SHEAR CALCULATIONS 
Bl General 

It is frequently assumed that for the purposes of determining deflections and shearing of light a three-dimensional mass 
distribution may be represented by a plane projection of the density. In particular, many workers in the field of gravitational 
lensing treat cosmological iV-body simulation cubes as collapsed planes. In this Appendix we investigate this assumption and 
show under what conditions the result holds. 

The result is approximate because of the need to apply the appropriate angular diameter distances at every deflection 
site (or evaluation position). In our derivation we assume that these factors are constant along the line of sight through the 
projected volume, whereas in practice they will vary slightly through the simulation volume. (The technique developed in 
this work applies the angular diameter distances at every evaluation position within each three-dimensional realisation, and 
evaluates the shear components at many locations within each to enable a complete description of the shearing of a light ray 
during its travel through the simulation.) 

We will show that computations based on two-dimensional (planar) projections of three-dimensional (periodic) simulations 
are adequate provided: (a) the mass distribution is periodic along the line of sight, and a single (full) period is included in 
the projection; (b) proper account is taken of the full transverse extent of matter, (which should normally be assumed to be 
periodic, unless strong lensing by matter limited in extent is being considered); by ignoring this requirement, it is likely that 
the convergence of deflection angles and shear components to their limiting values will not be achieved; (c) the net zero mass 
requirement is adopted. 

Consider the peculiar potential, <f>(x), in a three-dimensional periodic system as given above in equation (A18): 



js / V( x ' - nL ) 



(Bl) 



As discussed in Section A2, the integral in equation (Bl) is finite and the sum over n converges. Although we will be 
considering derivatives of the potential — in which case the second, constant, term drops out — it is useful to have a rigorous 
convergent expression for the peculiar potential in a periodic system. 

We are interested in obtaining the integrated shear along a line-of-sight over one period: f d 2 (f)(x)/dxidxj dz. We will 
begin by integrating the peculiar potential over one period: J 0(x) dz. 
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In the following we will split vectors over three dimensions into a two-dimensional component perpendicular to the line- 
of-sight and a component along the line-of-sight, here taken to be in the z direction. A superscript asterisk is used to denote 
2-dimensional quantities, e.g., x = (x*,z). Then the two-dimensional potential is 



0*(x*) 



(x) dz 



1 

1? 



L Jl? 



■ G 'y ^ rrij 'y ' J dz <p(x* — x^ n . , z 

■ G *y ^ m j J dz v?(x* — x* n . , z — z. 

in* 

■ G^rrij ip*(x* -xj n .) - i J <p*(x*' ~ n*L)d 2 x 



<p(x*' — n*L, z — sL) d 2 x*'dz' 



n*L, z) d x* dz' 



where 

V?*(x*) = / <^(x*,z) dz. 



(B2) 



(B3) 



Equation (B2) is the 2-dimensional analogue of equation (Bl), with the interaction potential, ip replaced by tp* . Thus the 
three-dimensional peculiar potential integrated over one period L in one dimension, gives the same result as that obtained 
from the projected (surface) density of particles with a 2-dimensional interaction potential arising from the projection from 
— oo to oo of the 3-dimensional interaction potential. The corresponding results for the shear components, d 2 (f>* /dxidxj, 
Xi,Xj ^ z, follow directly. 



B2 Special Case of a Coulombic Potential 

For the case of a Coulombic potential in 3-dimensions, ip = — 1/r, where r 2 = x* 2 + z 2 , tp*{x*) = — J dz/r diverges. Consider 
the two-dimensional potential over a finite range in z, < z < M, 



<Pm(x*) 



= -2 



ipdz 
M 



dz 



o Vx* 2 + , 



2 In | x*| - 2 In (m + y/x* 2 + M 2 



From equation ( B2 ) we can then write 

rif(x"-xj n .)-^ J <Pm(x*' - n*'L) d 2 x*' 



^»*(x*) 



lim < GV i 

M->oo ' — ' 



21n|x*— x* n *| — — / 2 In |x*' — n" L\ d 2 x*' 



(B4) 



(B5) 



Thus, for ip(r) = —1/r, the appropriate two-dimensional potential is (p*(r*) — 21n(r*) as expected. 



B3 Softening 

The discussion above applies to any suitably well-behaved interaction potential, ip. In particular, consider the case of a 
distribution of softened particles. This may be described in terms of an interaction potential which is Coulombic at large 
scales but which falls below 1/r at small scales. It is most convenient to derive the appropriate ip* in Fourier space. We may 
set <p(k) = ~4mS 2 {k)/k 2 with S(0) = 1 (see section (|A3|). The required two-dimensional Fourier transform is then 

tp*(k*) = J e lk x d 2 x* J <j}(x) dz 

= £(k*,0). (B6) 

Thus, for a given softening function S, the appropriate function ip* may be found, although an analytic solution may not 
be possible especially in view of the notorious difficulty of two-dimensional Fourier integrals. 
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APPENDIX C: THE EWALD SUMMATION METHOD 

In this appendix we turn to the numerical evaluation of sums for the potential and its derivatives in a periodic system such 



as those in equation (Bl). For a Coulombic potential J <p(x) d 3 x is divergent and the potential is only well defined (and 
convergent) if there is a uniform negative mass density to cancel the distribution of positive mass particles. Even if this 
condition is met the sum for the potential is only slowly convergent and difficult to compute numerically. Ewald (1921) 
proposed a method for computing such sums in the context of calculating lattice potentials of ionic crystals. The electrostatic 
problem suffers from exactly the same numerical difficulties as the gravitational problem (the pairwise potential in each case 
is Coulombic), and it is well known that naively summing over images of the fundamental cell gives an order-dependent result. 
Note that the requirement for zero total mass is the same as the requirement in calculating crystal energies that the total 
charge be zero. We derive below Ewald's method as it is applied to the problem of computing the gravitational potential and 
give expressions for the first and second derivatives of the potential — respectively the force and shear — and the total potential. 
We also demonstrate the relationship of the P 3 M technique to the Ewald method. 



CI The Ewald method 



Consider again a system of JV particles in a cube of side L. The density is given by equation (A20): 



2 



p(x) = > /// , 



S(x-Xj) - 



(CI) 



The second term on the right hand side of equation (CI) makes the mean density zero as required for the existence of a 
solution to Poisson's equation. As noted in section A5 , the delta function in equation ( |Cl| ) may be replaced by any compact 
even function, with volume integral equal to unity, to represent a distribution of particles with fixed softening. Alternatively, 
and equivalently, the interaction potential may be suitably modified. 



The gravitational potential at a point x within the cube is given by equation (A.18) 
ip(x - Xjn) - jp / y(x' - nl) d 3 x' , 



G 2J m 3 



(C2) 



where the notation is as in section A.2. Note that evaluating equation (C2) at a particle position, Xj, will include the self 
energy of particle i. 

The sum in equation (02) converges very slowly and is ill conditioned for numerical computation. Ewald (1921) proposed 
splitting the Coulombic potential into two components, 



<p(R) = y 1 {R) + V2{R), 



(C3) 



where the functional form of the split is chosen so that the first component is dominated by quickly converging local contribu- 
tions and the second contains the relatively smooth long range components of the field. The attenuation of high frequencies in 
the second component ensures rapid convergence of the corresponding sum when recast as a Fourier series. Ewald proposed 
taking (p\ — —erfc(r]R)/ R, ifi2(R) = — 1/R — <pi{R) = —&ci{r)R)/R, where erf and erfc are the error and complementary error 
functions respectively. The parameter r\ is chosen to optimize convergence of the resulting real- and Fourier-space sums. For 
the moment we will not specify the functional form and will continue with the description in equation (G3). 
If we ignore for the moment the mean contribution in equation (IC2), we can write the potential as 



(x) = Gy"Vj</?i(x- 



k 



(C4) 



The second term is a Fourier series sum over k = 2n\/L, 1 an integer triple, because of the periodicity of the system. Referring 
to the result in equation (A14) we see that the Fourier components <3>2k are given by 



$2k = p (k)£ 2 (k), 



(C5) 



where p'(x) = , <5(x — Xj). This is completely equivalent to the results of subsection |A2| but with ip replaced by ^2, the only 
difference being that there will be a much larger effective softening for 952. 

We will now ensure that the mean density is zero. It is not sufficient simply to set $2^=0 = in equation (CM) as part 
of the mean value of the field is contained in the first (real-space) term. The required potential is 



. „ G . nij 

(x) = G> mj<pi(x — Xj n ) - 



£ l(k = 0) + -g£ 



e <&2k- 



k^O 



The continuous Fourier transform of the potential in equation (C6) gives, after straightforward manipulation: 



#k) = G£(k) 



p'(k) - V.rM(k) =G£(k)p(k), 



(C6) 



(C7) 
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and thus, as required, the potential is the convolution of the total density, p, with the interaction potential, ip. Of course, for 
<p(R) = -l/R^ p(fc) = -4Tv/k 2 and we recover Poisson's equation in Fourier space. 



Substituting for p'(k) in equation (C5) and then using the result for <!>2k in equation ( JCq ) gives the final result 

. „ G ^ .m, 

(x) = G m,j<pi (x — x in ) - 



(k = 0) + -J| e !k - (x -^'m^ 2 (fc). 



(C8) 



j.k/O 



The potentials <px and <p2 can be chosen so that both sums in equation (CS) converge rapidly. 



For a given split of ip in equation (C3) we can calculate tpjjk) and hence efficiently calculate the potential at a point 
for a given particle distribution. From the expression in equation (C8) we may also derive expressions with similar desirable 
convergence properties for derivatives of the potential, such as the force and tidal field at x: 

d s cj> 



dec fj,^ dx f_^2 • "dx fj 



G E< 



0Vi(r) 



dr^dr^.-dr^. 



+ i s fc M1 fc M2 ...fc fls e lk ' (x Xj) mjip2(k) 



(C9) 



r — x — x : 



For the first and second derivatives of tpi(r) we have dipi/dr^ = (<£>i/r)r M and d fi/dr^dr^ = [(<p'i/r)' /r]r )Jr r v + (¥>i/r)<5 M „. 
From equation (29) we can immediately write 



2 ik.fx— x,) 

e J 



mjip2(k) 



(CIO) 



We can recast the first term as a Fourier series, since it is periodic; as before the coefficients for the Fourier sum are the same 
as those for the continuous transform. Using this result we obtain 



I j.k J 



mj<p 2 (k) }, 



(Cll) 



where the integral is over all space. Note that the first sum includes the mean k = component. Equation (Cll) leads directly 
to 



IS 



ik.(x-x 3 ) 



m,jk 2 ip — lim [fc 2 y>2(fc)] ^ ' i 



(C12) 



If the interaction potential is Coulombic on large scales, ~ —1/7?, then limfc_o k p2 ~ — 47r, provided that ipi/(p2 — > 
sufficiently rapidly with increasing i? (this condition is satisfied for any practical splitting choice). Equation (C12) then gives 

V72 



AnGp, 



including the negative mean density as in equation (CI). 

The total potential energy of the N particles in the box including interactions with all images is 

U = — ^2 mimjipixi - Xjn) 



(C13) 



(C14) 



(ignoring again temp orar ily the mean contribution to the peculiar potential in equation (C2)). We can make use of the same 
splitting in equation (C3) and the results that follow from it for the potential, provided that the condition j ^ i is observed. 
This is straightforward to take into account in the real-space sum by directly omitting the terms i = j. The Fourier sum, 
however, expresses the long-range component as a field which, to be correct at all locations, must contain the contribution 
of particle i. Measuring the potential at Xi using equation (C4) will, therefore, include the self-energy of particle i arising 
from the i nter action potential ip%. The self-energy contribution from particle i to the Fourier sum is Gm 2 ip2(R = 0)/2. Using 
equation (C8) we obtain, after a little manipulation, the result 



G f fc mj ) ~ 
U = —< / J rmmjipxi-x-j -x jn ) — A <^i(k = 0) 



+ -p- ^ g !k -(xi x j) m . mj<<g2 (fc) _ m 2 p2(R = 0) 



(C15) 
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C2 Practical issues in computing the Ewald sums 



A number of choices are possible for the interaction splitting in equation (C3). Of primary interest is good convergence of both 
the real-space and Fourier-space sums. It is also important to choose splitting functions which allow an efficient numerical 
scheme to be constructed. The most commonly used is that given by Ewald (1921), but a number of others have been discussed 
in the literature (e.g., Nijboer & de Wette, 1957). 

The Ewald scheme is a one-parameter family with <pi(R) = — erfc(r]R)/R, and <fi2{R) = —exi(r]R)/R. The Fourier 
transform (in three-dimensions) of if2 is <pi{k) = — 47re~ fc '/fc 2 . The parameter r\ is chosen to optimize convergence. 
Nijboer & de Wette (1957) show that both the real-space and Fourier-space series have the same rate of convergence if 
r\ = y/ir/L. Satisfactory results are obtained for a range of values near this result. Machine accuracy (32 bit) can typically be 
achieved with this value of r\ by extending the sums in real space and Fourier space to a radial distance of roughly five image 
cells or Fourier modes respectively. The analytic result is independent of the value of r\ although the rates of convergence of 
the sums are affected. Increasing r\ causes a faster convergence of the real-space sum whilst requiring the accumulation of a 
greater number of terms in the Fourier series to achieve a given precision. If r\ deviates too much from the optimal value, 
one of the sums will require the accumulation of a large number of terms for good convergence and we will be faced with the 
original computational problem which the splitting was introduced to solve. 

Clearly each computation of the potential (or force etc.) requires O(N) operations, and so to find the force on each of 
N particles, for example, requires 0(N 2 ) operations, and so is not competitive with other methods currently available. The 
present attraction of the method is its simplicity and that it enables forces to be calculated to high precision. 

Some impro veme nt in computational efficiency can be obtained by noting that the Fourier sums in equa- 
tions ( |CS| ) and ( |C14| ) can be factorized. For equation (38), for example, we can express ^\ k ^ Q e lk '^ x ~ x ^mj(£i2(fc) as 



Sk^o e ' k X( P 2 (k) (^2j m j e lkx J^. The sum over j can be precomputed and stored for the small number of wavenumbers 
fc required (typically a few hundred). This reduces the cost of the Fourier sum to an O(N) operation for a fixed number of 
wavemodes and allows the parameter rj to be increased, which reduces the work of the real-space sum. Note that since ip? is 
a real, even function, its Fourier transform will also be real and even. This allows the complex exponential to be reduced to a 
cosine for numerical computation of the sum. (The factorization just described is then only marginally more complicated — see 



section C4.) 



C3 Relationship to the P 3 M method 

The P 3 M algorithm uses an interaction splitting which can be described using the same terminology set out above. The 
splitting employed is analogous in many ways to choosing a very large value of r). This reduces the range over which the 
real-space sum must be accumulated and throws most of the work into the Fourier sum. In P 3 M the real-space sum is reduced 
to such an extent that the effective range is much smaller than the periodic distance, L. Indeed a different functional split is 
commonly used in which ipi is compact. This allows efficient techniques to be used in which only nearby particles need be 
included in the real-space sum. Since the effective range of the real-space sum is now much less than the period distance L, 
this part of the calculation now takes O(Nn) operations to compute, where n is the mean number of particles within the range 
of the function tp\. Provided n does not become too large (as it unfortunately will in gravitational simulations as clustering 
develops) the real-space work is essentially O(N). Pushing much of the work into the Fourier domain is only advantageous 
if efficient methods are available for accumulating the Fourier sum and if we can avoid the convergence problems discussed 
above in accumulating forces from the Fourier components. 



The Fourier part of the P 3 M method is best understood in terms of equation (C6) rather than equation (C8). The key to 
the method lies in approximating the Fourier components $2k- Instead of calculating afresh the Fourier sum for each value 
of x for which it is required, the result is interpolated from stored values discretized on a regular grid. Provided the field is 
adequately sampled, the error in this procedure can be reduced to acceptable levels. The interaction splitting must be chosen 
such that the number of grid points available is sufficient to represent the harmonic content of the field. Since the number of 
wavenumbers over which the field must be known may now be very large (a consequence of the short range of the real-space 
sum), an efficient method for calculating the Fourier components from the density field and interaction potential must be 
used. This is achieved by sampling the density field with a uniform grid and using a Fast Fourier Transform (FFT) technique 
for obtaining the potential. Using an FFT also ensures well-determined convergence properties for the Fourier sums. 



C4 Formula? for the Ewald method 



Using the splitting described by Ewald we will now write down explicit expressions for equations (C8) and (C14) which can 
be used to compute the potential, its derivatives and the total potential. 

We have ipi = —exic(rjR) / R, ip2 = —eri(riR)/R and ip2 = — 4-7re~ fc ^ Arl '/A: 2 . (The error function, erf(a;), is 
(2/^/tv) J* e _< dt. Many approximations to erfc suitable for efficient numerical computation exist in the literature and are 
often also available in mathematical libraries on many current computers.) To calculate fi(0) we must take the limit of the 
transform of —l/R + erk(r]R)/R as k — * 0. This gives tp i(0) = —Tv/r] 2 . The value of ^2(0) = — liniij^o erf (rjR)/R — —2t]/^/it. 
Recall that the Fourier modes are labelled by k = 27rl/L where 1 is an integer triple. Finally, define C(l) = . rrij cos(27rl.Xj/L) 
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and S(\) = ■ m j sin(27rl.Xj/L). Putting all of this together we find: 



-Gi ^m 3 erfc(r 7 r)/r| r=x _ 



1*0 



and 



^ E p [ C ® cos ( 27rl - x / 1 ') + S ( l ) sin(27rl.x/L)] e"" 2 ' 2 



(/ = — ^/ ^ ^ rriimjeTfc(r]r)/ 



I r— Xj — x^ 



7/ 2 L 3 



+^E^ c2 ( 1 )+^( 1 )]^ 2i2/(iV) -SE- 



1*0 



Derivatives of <j> may be calculated trivially from equation (C16). We have: 



and 



erfc(r;r) H —r\re 

V 71 " 



+ ^ E p sm (2^1.x/L) - 5(1) cos(2tt1.x/L)] e"" 2 ' 
Mo 



2 ' 2 /(lV) 



4 3 ->7 2 r 2 r M r„ / , . 2 
e — - — h erfc(?7r) + 



V r 3 r 5 / 



+ § E 7^ cos(2tt1.x/L) + 5(1) sin(2^1.x/L)] e^'/Ci- V 

1*0 



_2,2 // r 2 2, 



Note: asymptotically erfc(x) ~ l/( v / 7rx)e 



